AIbase

# CLIP multimodal fusion

Kandinsky 2.2 Decoder
Apache-2.0
Kandinsky 2.2 is a text-to-image generation model that builds on best practices from DALL-E 2 and latent diffusion models. It uses CLIP as both the text and image encoder to enhance visual expressiveness.
Text-to-Image
kandinsky-community
15.44k
64
Kandinsky 2.1 Inpaint
Apache-2.0
Kandinsky 2.1 is a text-to-image generation model that builds on best practices from DALL-E 2 and latent diffusion models. It uses CLIP as both the text and image encoder to enhance visual expressiveness.
Text-to-Image
kandinsky-community
2,268
9
© 2025 AIbase